Where is the CRDT for syntax trees

While the community around distributed data structures are settling on CRDTs as the future,

I’m surprised there is not much CRDT literature/projects for dealing with rich text. Fortunately

Ink and Switch recently released their work on a CRDT dedicated for rich-text and it got me

excited!

While Peritext has got a lot of things right with how content and formatting is weaved together,

dealing with block elements like lists and tables was declared as a work in progress. To be

honest, handling semantic blocks like lists and tables is a different beast altogether.

I’ll explain why.

When you are working on plain text, the operations are dead simple. There could be insertions

- ins(character, index) or deletions - del(index, number_of_characters) and that is all.

Most CRDT literature take this as a classic example and show how their algorithm satisfies

eventuality constraints with this simple data model. When you bring in formatting the

complexity goes one step further where the algorithm has to deal with different formatting

options and generate different sensible outcomes tailored for each case. With formattings we

are now dealing with two dimensional boundaries

applyFormat(formatType, startIndex, endIndex)

When we move on to block elements, we are now making the content of the document into a

tree (like how most UI representations are trees). To be specific, the document is now a proper

tree with flat character arrays existing at the leaf nodes. Imagine DOM.

Suddenly the complexity of CRDT that we should come up with has increased multi-fold. By

now it has to worry about node additions, node deletions, node movements - all these have to

play smoothly with the existing character array and formatting operations.

To put it in perspective, the algorithm might end up having to compose a delete character

operation, a node addition operation and a formatting operation - all landing concurrently

with overlapping locations and somehow the algorithm has to figure out how to resolve this

with the most sensible outcome - all while keeping your rich-text tree semantically correct- i.e

renderable by the rendering program. For example imagine a user splitting a cell into two, and

another user concurrently deleting the entire column the cell was placed in. This should never

result in a tree (table) with a cell hanging out of a column. That is technically not renderable.

This last part about keeping the tree renderable is the key. That is the reason why we would not

be able to apply generic JSON based CRDTs like Automerge. Automerge will have zero idea

about what will make a renderable tree. While Automerge will always give you an eventually

consistent JSON tree, it cannot guarantee the tree follows all the rules of your grammar.

Talking about trees with certain grammar, does it ring any bells? Yes, ASTs! Almost all abstract

syntax trees of programming languages are trees that follow a specification or a grammar. If

they don’t fall under that grammar it most likely is a syntax error of the source text.

So at this point this is the generalization I’ve arrived at – When we are able to arrive at a CRDT

for syntax trees, we’ll be able to make collaborative programming always auto-resolve with the

syntactically correct program (hypothetically) and we’ll also be able to deal with block elements

in rich text in such a way the resolved rich-text data structure is always renderable!

Hopefully this puts the problem into a bigger picture and makes it really worth solving!

- Joe